Aspects concerning on the SVM Method’s Scalability
نویسندگان
چکیده
In the last years the quantity of text documents is increasing continually and automatic document classification is an important challenge. In the text document classification the training step is essential in obtaining a good classifier. The quality of learning depends on the dimension of the training data. When working with huge learning data sets, problems regarding the training time that increases exponentially are occurring. In this paper we are presenting a method that allows working with huge data sets into the training step without increasing exponentially the training time and without significantly decreasing the classification accuracy.
منابع مشابه
Unbalanced and Partial L1 Monge–Kantorovich Problem: A Scalable Parallel First-Order Method
We propose a new algorithm to solve the unbalanced and partial L1-Monge– Kantorovich problems. The proposed method is a first-order primal-dual method that is scalable and parallel. The method’s iterations are conceptually simple, computationally cheap, and easy to parallelize. We provide several numerical examples solved on a CUDA GPU, which demonstrate the method’s practical effectiveness.
متن کاملBiomedical Named Entity Recognition Using Support Vector Machines: Performance vs. Scalability Issues
This paper examines the performance and scalability of Named Entity Recognition (NER) using multi-class Support Vector Machines (SVM) and high-dimensional features. The NER domain chosen for these experiments is the biomedical publications domain, especially selected due to its importance and inherent challenges. We use a simple machine learning approach that eliminates prior language knowledge...
متن کاملTitle: Using machine learning methods to predict experimental high- throughput screening data
High-throughput screening (HTS) remains a very costly process notwithstanding many recent technological advances in the field of biotechnology. In this study we consider the application of machine learning methods for predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compounds libraries and similar dr...
متن کاملTime Properties Dedicated Semantics for UML-MARTE Safety Critical Real-Time System Verification
Critical real-time embedded systems (RTES) crucially have strong requirement concerning system’s reliability. UML and its profile MARTE are standardized modeling language that are getting widely accepted by industrial designers to cope with the development of complex RTSE. In Model-driven engineering, verification at early phases of the system lifecycle is an important problem, which remains op...
متن کاملMethod for Aspect-Based Sentiment Annotation Using Rhetorical Analysis
This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive sum...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007